Search for: All records

Creators/Authors contains: "Stanford, Caleb"

Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.

  1. The Rust programming language is a prominent candidate for a C and C++ replacement in the memory-safe era. However, Rust's safety guarantees do not in general extend to arbitrary third-party code. The main purpose of this short paper is to point out that this is true even entirely within safe Rust, as we illustrate through a series of counterexamples. To complement our examples, we present initial experimental results to investigate: do existing program analysis and program verification tools detect or mitigate these risks? Are these attack patterns realizable via input to publicly exposed functions in real-world Rust libraries? And to what extent do existing supply chain attacks in Rust leverage similar attacks? All of our examples and associated data are available as an open source repository on GitHub. We hope this paper will inspire future work on rethinking safety in Rust, especially work that goes beyond the safe/unsafe distinction and hardens Rust against a stronger threat model of attacks that can be used in the wild.
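    To make the risk concrete, here is a minimal sketch (our own illustration, not taken from the paper's repository) of the kind of counterexample the abstract describes: a third-party trait implementation, written entirely in safe Rust, that performs a hidden side effect when the host crate does something as innocuous as formatting a value. The environment-variable name and file path are hypothetical.

        // Entirely safe Rust: no `unsafe` block anywhere.
        use std::fmt;

        pub struct Version(pub &'static str);

        impl fmt::Display for Version {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                // Safe Rust may freely touch the file system and environment:
                // formatting this value silently exfiltrates a secret.
                let _ = std::fs::write(
                    "/tmp/exfil",
                    std::env::var("API_KEY").unwrap_or_default(),
                );
                write!(f, "{}", self.0)
            }
        }

        fn main() {
            // The host crate believes it is only printing a version string.
            println!("using version {}", Version("1.2.3"));
        }

    Nothing here requires unsafe code: Rust's type system treats file and environment access as ordinary safe operations, which is exactly the gap the paper probes.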
  2. Code completions produced by today's large language models (LLMs) offer no formal guarantees. We propose proof-carrying code completions (PC3). In this paradigm, a high-resourced entity (the LLM provided by the server) must provide a code completion together with a proof of a chosen safety property which can be independently checked by a low-resourced entity (the user). In order to provide safety proofs without requiring the user to write specifications in formal logic, we statically generate preconditions for all dangerous function calls (i.e., functions that may violate the safety property) which must be proved by the LLM. To demonstrate the main ideas, we provide a prototype implementation in the program verification language Dafny, and a case study focusing on file system vulnerabilities. Unlike Python code generated by GPT-4, Dafny code generated by PC3 provably avoids a common weakness related to path traversal (CWE-35), using a single generation attempt (k = 1) and a modest number of tokens (3,350). Our tool is available as an open source repository at https://github.com/DavisPL/PCCC.
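    The paper's prototype is written in Dafny; the following Rust sketch (function names hypothetical) only illustrates the shape of the statically generated precondition that PC3 would require a proof of before a dangerous file-system call: the resolved path must stay within an allowed root, ruling out path traversal (CWE-35).

        use std::fs;
        use std::io;
        use std::path::{Path, PathBuf};

        /// Precondition for the dangerous call: the resolved path must stay
        /// inside `root`, ruling out path traversal (CWE-35).
        fn resolve_within(root: &Path, user_path: &Path) -> io::Result<PathBuf> {
            let resolved = root.join(user_path).canonicalize()?;
            if resolved.starts_with(root.canonicalize()?) {
                Ok(resolved)
            } else {
                Err(io::Error::new(
                    io::ErrorKind::PermissionDenied,
                    "path escapes root",
                ))
            }
        }

        fn read_user_file(root: &Path, user_path: &Path) -> io::Result<String> {
            // The dangerous call is only reachable once the precondition holds.
            let safe = resolve_within(root, user_path)?;
            fs::read_to_string(safe)
        }

        fn main() {
            let root = Path::new("/srv/data");
            // "../etc/passwd" is rejected; "report.txt" under the root is allowed.
            match read_user_file(root, Path::new("../etc/passwd")) {
                Ok(text) => println!("{text}"),
                Err(e) => eprintln!("rejected: {e}"),
            }
        }

    In PC3 the analogous check is a logical precondition discharged by the verifier, so the guard is proved rather than tested at runtime.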
  3. Zhang, Danfeng; Krishnaswami, Neel (Ed.)
    Over the last several years, the Rust programming language has gathered a following among software developers for its robust memory safety features. Nevertheless, it remains susceptible to potentially harmful side effects in untrusted code and is therefore vulnerable to supply chain attacks. We investigate whether such attacks can be prevented by retroactively enforcing side-effect safety. In this extended abstract, we introduce Coenobita, a Rust library that prevents undesirable side effects using statically enforced, zero-cost, and unobtrusive capabilities, without additional performance overhead. To evaluate Coenobita's practicality and effectiveness, we conducted two case studies porting the popular Rust crates walkdir and remove_dir_all to Coenobita. Porting walkdir required modifying or adding around 242 lines across three files that originally contained about 1,800 lines in total. Benchmarks run on the 46 tests provided in walkdir and their equivalents in coenobita-walkdir show little change in runtime for most tests.
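    A minimal sketch of the capability style the abstract describes, with hypothetical names rather than Coenobita's actual API: a zero-sized token must be threaded to any function that touches the file system, so the side effect is visible in the types and costs nothing at runtime.

        use std::fs;
        use std::io;
        use std::path::Path;

        /// Zero-sized capability: holding a value of this type is the only
        /// way to call the file-system wrappers below; it has no runtime cost.
        pub struct FsCap(());

        impl FsCap {
            /// Only trusted code (e.g., `main`) should mint the capability;
            /// library code must receive it explicitly as an argument.
            pub fn acquire() -> FsCap {
                FsCap(())
            }
        }

        /// A third-party helper can only touch the file system if its caller
        /// passes the capability, making the side effect explicit in its type.
        pub fn read_config(_cap: &FsCap, path: &Path) -> io::Result<String> {
            fs::read_to_string(path)
        }

        fn main() {
            let cap = FsCap::acquire();
            match read_config(&cap, Path::new("Cargo.toml")) {
                Ok(s) => println!("{} bytes read", s.len()),
                Err(e) => eprintln!("{e}"),
            }
        }

    Because the token is a zero-sized type, the compiler erases it entirely, which is the sense in which such capabilities can be zero-cost.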
  4. Motivated by distributed data processing applications, we introduce a class of labeled directed acyclic graphs constructed using sequential and parallel composition operations, and study automata and logics over them. We show that deterministic and non-deterministic acceptors over such graphs have the same expressive power, which can be equivalently characterized by Monadic Second-Order logic and the graded µ-calculus. We establish closure under composition operations and decision procedures for membership, emptiness, and inclusion. A key feature of our graphs, called synchronized series-parallel graphs (SSPG), is that parallel composition introduces a synchronization edge from the newly introduced source vertex to the sink. The transfer of information enabled by such edges is crucial to the determinization construction, which would not be possible for the traditional definition of series-parallel graphs. SSPGs allow both ordered ranked parallelism and unordered unranked parallelism. The latter feature means that in the corresponding automata, the transition function needs to account for an arbitrary number of predecessors by counting each type of state only up to a specified constant, thus leading to a notion of counting complexity that is distinct from the classical notion of state complexity. The determinization construction translates a nondeterministic automaton with n states and k counting complexity to a deterministic automaton with 2^(n²) states and k^n counting complexity, and both these bounds are shown to be tight. Furthermore, for nondeterministic automata a bound of 2 on counting complexity suffices without loss of expressiveness.
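    As a rough illustration (not the paper's formal definition), the compositional structure of SSPGs can be captured by an algebraic datatype in which parallel composition takes an unordered, unranked collection of branches:

        /// Sketch of the compositional structure the abstract describes:
        /// graphs built from labeled atoms by sequential and parallel
        /// composition, where parallel composition also introduces a
        /// synchronization edge from the fresh source to the fresh sink.
        #[derive(Debug)]
        enum Sspg<L> {
            /// A single labeled edge.
            Atom(L),
            /// Sequential composition: the sink of the first graph is fused
            /// with the source of the second.
            Seq(Box<Sspg<L>>, Box<Sspg<L>>),
            /// Parallel composition of an unordered, unranked collection of
            /// branches between a fresh source and sink.
            Par(Vec<Sspg<L>>),
        }

        fn main() {
            use Sspg::*;
            // (a ; b) composed in parallel with c:
            let g: Sspg<char> =
                Par(vec![Seq(Box::new(Atom('a')), Box::new(Atom('b'))), Atom('c')]);
            println!("{g:?}");
        }

    The unranked Vec in Par is what forces the automaton's transition function to count predecessor states only up to a constant, giving rise to the counting-complexity measure.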
  5. Distributed architectures for efficient processing of streaming data are increasingly critical to modern information processing systems. The goal of this paper is to develop type-based programming abstractions that facilitate correct and efficient deployment of a logical specification of the desired computation on such architectures. In the proposed model, each communication link has an associated type specifying tagged data items along with a dependency relation over tags that captures the logical partial ordering constraints over data items. The semantics of a (distributed) stream processing system is then a function from input data traces to output data traces, where a data trace is an equivalence class of sequences of data items induced by the dependency relation. This data-trace transduction model generalizes both acyclic synchronous data-flow and relational query processors, and can specify computations over data streams with a rich variety of partial ordering and synchronization characteristics. We then describe a set of programming templates for data-trace transductions: abstractions corresponding to common stream processing tasks. Our system automatically maps these high-level programs to a given topology on the distributed implementation platform Apache Storm while preserving the semantics. Our experimental evaluation shows that (1) while automatic parallelization deployed by existing systems may not preserve semantics, particularly when the computation is sensitive to the ordering of data items, our programming abstractions allow a natural specification of the query that contains a mix of ordering constraints while guaranteeing correct deployment, and (2) the throughput of the automatically compiled distributed code is comparable to that of hand-crafted distributed implementations.
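    As a rough sketch (the tags and the dependency relation below are hypothetical), the typing idea can be illustrated as follows: items carry tags, a dependency relation over tags fixes which reorderings are observable, and two sequences denote the same data trace when their projections onto every dependent pair of tags agree.

        #[derive(Clone, Copy, Debug, PartialEq)]
        enum Tag {
            SensorA,
            SensorB,
            Marker, // a synchronization marker, dependent on everything
        }

        /// Dependency relation over tags: items with dependent tags must keep
        /// their relative order; independent items may be freely reordered.
        fn dependent(x: Tag, y: Tag) -> bool {
            x == y || x == Tag::Marker || y == Tag::Marker
        }

        /// Two sequences denote the same data trace iff their projections
        /// onto each dependent pair of tags agree (the standard projection
        /// characterization of trace equivalence).
        fn same_trace(a: &[(Tag, i64)], b: &[(Tag, i64)]) -> bool {
            let tags = [Tag::SensorA, Tag::SensorB, Tag::Marker];
            for &t1 in &tags {
                for &t2 in &tags {
                    if !dependent(t1, t2) {
                        continue;
                    }
                    let proj = |s: &[(Tag, i64)]| -> Vec<(Tag, i64)> {
                        s.iter().copied().filter(|(x, _)| *x == t1 || *x == t2).collect()
                    };
                    if proj(a) != proj(b) {
                        return false;
                    }
                }
            }
            true
        }

        fn main() {
            let s1 = [(Tag::SensorA, 1), (Tag::SensorB, 2), (Tag::Marker, 0)];
            let s2 = [(Tag::SensorB, 2), (Tag::SensorA, 1), (Tag::Marker, 0)];
            // SensorA and SensorB items are independent, so both sequences
            // represent the same data trace.
            println!("{}", same_trace(&s1, &s2)); // true
        }

    A deployment is semantics-preserving exactly when it cannot distinguish two inputs in the same equivalence class, which is what the link types let the compiler check.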
  6. In real-time decision making and runtime monitoring applications, declarative languages are commonly used as they facilitate modular high-level specifications with the compiler guaranteeing evaluation over data streams in an efficient and incremental manner. We introduce the model of Data Transducers to allow modular compilation of queries over streaming data. A data transducer maintains a finite set of data variables and processes a sequence of tagged data values by updating its variables using an allowed set of operations. The model allows unambiguous nondeterminism, exponentially succinct control, and combining values from parallel threads of computation. The semantics of the model immediately suggests an efficient streaming algorithm for evaluation. The expressiveness of data transducers coincides with streamable regular transductions, a robust and streamable class of functions characterized by MSO-definable string-to-DAG transformations with no backward edges. We show that the novel features of data transducers, unlike previously studied transducers, make them as succinct as traditional imperative code for processing data streams, but the structuring of the transition function permits modular compilation. In particular, we show that operations such as parallel composition, union, prefix-sum, and quantitative analogs of combinators for unambiguous parsing, can be implemented by natural and succinct constructions on data transducers. To illustrate the benefits of such modularity in compilation, we define a new language for quantitative monitoring, QRE-Past, that integrates features of past-time temporal logic and quantitative regular expressions. While this combination allows a natural specification of a cardiac arrhythmia detection algorithm in QRE-Past, compilation of QRE-Past specifications into efficient monitors comes for free thanks to succinct constructions on data transducers.
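    A minimal sketch (our own illustration, far simpler than the paper's full model) of the flavor of a data transducer: a fixed, finite set of data variables updated by simple operations as each tagged value arrives, giving a one-pass incremental evaluator.

        enum Event {
            Sample(i64), // a tagged data value
            Query,       // ask for the current aggregate
        }

        /// Data variables of the transducer: a running sum and a count.
        struct AvgTransducer {
            sum: i64,
            count: i64,
        }

        impl AvgTransducer {
            fn new() -> Self {
                AvgTransducer { sum: 0, count: 0 }
            }

            /// One transition per input event, constant work per item, so
            /// the whole stream is evaluated in a single incremental pass.
            fn step(&mut self, e: &Event) -> Option<f64> {
                match e {
                    Event::Sample(v) => {
                        self.sum += v;
                        self.count += 1;
                        None
                    }
                    Event::Query => {
                        (self.count > 0).then(|| self.sum as f64 / self.count as f64)
                    }
                }
            }
        }

        fn main() {
            let stream = [Event::Sample(3), Event::Sample(5), Event::Query];
            let mut t = AvgTransducer::new();
            for e in &stream {
                if let Some(avg) = t.step(e) {
                    println!("average so far: {avg}"); // 4
                }
            }
        }

    Larger queries are built by composing such transducers (parallel composition, union, prefix-sum, and so on), which is the modular compilation the abstract emphasizes.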